Abstract

Optimizing the Fisher ratio is well established in statistical pattern recognition as a 
means of discriminating between classes. I show how to optimize that ratio for optical 
correlation intensity by choice of filter on an arbitrary spatial light modulator (SLM). I 
include the case of additive noise of known power spectral density. 

1. Introduction 

There is a long and venerable history [1] in optical correlation pattern recognition 
(OCPR) of building filters to permit the discrimination between two classes of objects on 
the basis of their correlation intensities. Usually we would like objects in the “accept” 
class to have large intensity, and conversely for the “reject” class. The classical Fisher 
linear discriminant (FLD) reduces a highly dimensioned (and possibly complex) vector to 
a single quantity with the intention that it can be thresholded as a discriminant between 
classes. The classical FLD operates as a linear transform of the input object, and its 
optimizing filter best separates the means of the transformed classes, as normalized to 
their widths. In general it is a good metric to optimize, and under some circumstances 
(e.g. identical normal distributions) the FLD is an exactly optimal (minimal error) 
classifier. I have not previously seen the Fisher ratio analytically optimized for OCPR. 

In OCPR we work with intensities, not just the complex field amplitude that is the last 
linear stage in the optical correlation process. Thus we cannot tell the difference 
between complex correlations of the same amplitude even if their values separate well in 
the complex plane. We take the Fisher ratio for OCPR to be the squared difference 
between mean correlation intensities for two classes of objects, divided by the sum of the 
correlation variances for those two classes (see Eq. (1)). I show how to maximize the 
optical Fisher ratio by choice of a filter to be realized on an arbitrary SLM. I include 
additive noise in determining the normalizing variances. 

2. Formulating the problem 

We assume that there is an optimal filter; a necessary condition for its optimality is 
that the partial derivative of the Fisher ratio with respect to allowed changes in the filter 
is zero. In practice this laboratory has not found any problem with local maxima, nor with 
the fact that the specification for the worst filter nominally looks the same. The filter 
value at one frequency cannot be optimally chosen without regard to all other frequencies’ 
filter values, the signals to be discriminated, and the noise. An explicit feature here is 
that all such information is condensed to comparatively few parameters to search over 
(essentially one complex scalar per training image, which compares favorably with the 
typical filter’s tens of thousands of frequencies). Then the optimal filter value at the 
frequency under consideration must be chosen from the set of realizable values. 

We adopt the following nomenclature. C is a class (A for accept, $\mathcal{R}$ for reject). 

frequencies (we use the one-dimensional notation), and the correlation intensity is 

We assume the effect of noise on intensity has zero mean (else its effect can be 
incorporated into the class means for intensity, and so we will take no further notice of it 
here). The total variance is modeled as $\sigma^2_{CT} = \sigma^2_C + \sigma^2_n$. With these definitions we can set 
up the optical Fisher ratio, J, and optimize it. 

defining the numerator and denominator of J. 
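
As a rough numerical illustration of the ratio described in words above (squared difference of the class mean intensities, divided by the sum of the two total class variances), a minimal Python sketch might look as follows. The function name, the argument names, and the sample intensities are hypothetical, and the use of the (N - 1) sample variance is an assumption made for the illustration.

```python
from statistics import mean, variance


def optical_fisher_ratio(accept_intensities, reject_intensities, noise_variance=0.0):
    """Optical Fisher ratio J = Num / Den from correlation intensities.

    Num is the squared difference of the class mean intensities. Den is the
    sum of the two total class variances, each modeled as the class variance
    plus the additive-noise variance.
    """
    mean_accept = mean(accept_intensities)
    mean_reject = mean(reject_intensities)
    num = (mean_accept - mean_reject) ** 2
    # statistics.variance uses the (N - 1) sample normalization (an assumption here).
    den = (variance(accept_intensities) + noise_variance
           + variance(reject_intensities) + noise_variance)
    return num / den


# Hypothetical intensities, for illustration only.
accept = [9.0, 10.0, 11.0]
reject = [1.0, 2.0, 3.0]
j_clean = optical_fisher_ratio(accept, reject)
j_noisy = optical_fisher_ratio(accept, reject, noise_variance=1.0)
```

In this toy case the additive noise leaves the numerator unchanged and enlarges the denominator, so the ratio drops.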

3. Optimizing the filter 

The optimization strategy is based on that of Juday [2,3]. We take the radial 
component of the gradient of J in the complex plane of values for $H_m$, and from that 
infer the azimuthal component, and thus deduce the optimal realizable values for $H_m$. 

Taking the radial derivative in the complex plane of $H_m$, 

and we see that we need several partial derivatives. As shown by Juday, 

and restriction to a class and taking the expectation is straightforward. From the 
modeling of the variance, 

In a little more detail than in the definition, 

Reorganizing and inserting Eq. (6) into Eq. (4), 


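$\frac{\partial \sigma^2_{CT}}{\partial M_m} = \cos\theta_m \, \frac{4}{N_C - 1} \sum_{i \in C} \left( I_i - \langle I \rangle_C \right) \left[ B_i A_{im} \cos(\beta_i - \phi_{im}) - \frac{1}{N_C} \sum_{k \in C} B_k A_{km} \cos(\beta_k - \phi_{km}) \right] + \sin\theta_m \, \{\text{similar}\} + 2 M_m P_{nm}.$    (7)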
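
The cosine and sine terms in Eq. (7) come from the radial derivative of a single correlation intensity. A minimal finite-difference check is sketched below. It assumes (the defining equations are not reproduced here) that the central correlation field is the sum over frequencies of $H_m S_{im}$, with $H_m = M_m \exp(j\theta_m)$, $S_{im} = A_{im} \exp(j\phi_{im})$, and field value $B_i \exp(j\beta_i)$. The function names and the three-frequency data are hypothetical.

```python
import cmath
import math


def center_intensity(filter_values, spectrum):
    """Intensity |sum_m H_m S_m|^2 of the central correlation value (assumed model)."""
    field = sum(h * s for h, s in zip(filter_values, spectrum))
    return abs(field) ** 2


def radial_derivative(filter_values, spectrum, m):
    """Analytic dI/dM_m = 2 B A_m cos(beta - phi_m - theta_m)."""
    field = sum(h * s for h, s in zip(filter_values, spectrum))
    b, beta = cmath.polar(field)
    a_m, phi_m = cmath.polar(spectrum[m])
    theta_m = cmath.phase(filter_values[m])
    return 2.0 * b * a_m * math.cos(beta - phi_m - theta_m)


# Hypothetical three-frequency filter and training spectrum.
H = [cmath.rect(0.8, 0.3), cmath.rect(0.5, -1.1), cmath.rect(1.0, 2.0)]
S = [1.5 - 0.5j, -0.7 + 2.0j, 0.3 + 0.9j]

# Central difference in the magnitude M_m at fixed phase theta_m.
m, step = 1, 1e-6
magnitude, theta = cmath.polar(H[m])
H_plus = list(H)
H_plus[m] = cmath.rect(magnitude + step, theta)
H_minus = list(H)
H_minus[m] = cmath.rect(magnitude - step, theta)
numeric = (center_intensity(H_plus, S) - center_intensity(H_minus, S)) / (2 * step)
analytic = radial_derivative(H, S, m)
```

Expanding $\cos(\beta_i - \phi_{im} - \theta_m)$ into $\cos\theta_m \cos(\beta_i - \phi_{im}) + \sin\theta_m \sin(\beta_i - \phi_{im})$ gives the two-part form used in Eq. (7).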

Now consider an operator $\square$ such that 

$\exp(j\theta) \,\square\, R\exp(j\alpha) = R\cos\theta\cos\alpha + R\sin\theta\sin\alpha$ 

gives the projection of $R\exp(j\alpha)$ onto the unit vector in the complex direction $\exp(j\theta)$. 
We regard an equation like the first part of Eq. (7) as expressing the gradient of a quantity 
as it interacts with the complex unit vector $\exp(j\theta)$, and further, that the gradient is 
uniform [later we shall look at the last term in Eq. (7)]. From this perspective we build 
$\nabla_m J$, the gradient of J as a function of position in the complex plane of values for $H_m$. 
We assemble Eqs. (3) and (7) into Eq. (2), with the result 

In the term of Eq. (9) that is the product of sums, we interchange summation order and 
swap indices so that it becomes a sum on $S^*_{im}$. Then we can express $\nabla_m J$ as a sum over 
the training images’ conjugated spectral transforms $S^*_{im}$ (and the $M_m P_{nm}$ term). 



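$\mathrm{Den}^2 \, \nabla_m J = \sum_{i \in A} S^*_{im} \left[ \frac{D_i \, \mathrm{Den} \sqrt{\mathrm{Num}}}{N_A} - \frac{D_i \, \mathrm{Num} \left( I_i - \langle I \rangle_A \right)}{N_A - 1} + \frac{D_i \, \mathrm{Num}}{N_A (N_A - 1)} \sum_{k \in A} \left( I_k - \langle I \rangle_A \right) \right]$ 

$\qquad + \sum_{i \in \mathcal{R}} S^*_{im} \left[ - \frac{D_i \, \mathrm{Den} \sqrt{\mathrm{Num}}}{N_{\mathcal{R}}} - \frac{D_i \, \mathrm{Num} \left( I_i - \langle I \rangle_{\mathcal{R}} \right)}{N_{\mathcal{R}} - 1} + \frac{D_i \, \mathrm{Num}}{N_{\mathcal{R}} (N_{\mathcal{R}} - 1)} \sum_{k \in \mathcal{R}} \left( I_k - \langle I \rangle_{\mathcal{R}} \right) \right] - \mathrm{Num} \, M_m P_{nm}$    (10)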


The terms in square brackets are not functions of m, but there is one for each reference 
image. Therefore each can be replaced by a complex constant $T_i$, now giving 

which implicitly defines the set of complex coefficients $\{T_i;\ i \in A \cup \mathcal{R} \cup \text{noise}\}$, one per 
training object plus a real-valued one for the input noise (if that noise is present). The 
coefficients represent the necessary information as mentioned in the first paragraph of 
Section 2. We do not know the coefficients a priori. However, we know they exist, and 
we can search for their values and confirm them by comparing Eqs. (10) and (11) when 
we have maximized J in the search. Following the gradient-of-metric logic developed in 
Section 13 of Juday [2], Eq. (11) is our guide to the selection of the optimal filter value for 
the m-th frequency. If $P_{nm}$ is zero, we select for $H_m$ the realizable value that has the 
largest projection in the direction of $\nabla_m J$. If $P_{nm}$ is not zero, we compute an ideal value 

and then select for $H_m$ the realizable value that is closest by Euclidean measure. The 
effect of the additive noise is to change the maximum-projection filter toward 
matched-filter behavior. 
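
The two selection rules just described can be sketched in a few lines of Python. This is only an illustration under stated assumptions: the uniform part of the gradient at frequency m is taken as the sum over training images of $T_i S^*_{im}$, the ideal value for the noisy case is passed in as a number because its expression is not reproduced here, and the eight-level phase-only SLM, the trial coefficients, and the spectra are hypothetical.

```python
import cmath


def project(theta, z):
    """Projection of z onto the unit vector exp(j*theta): Re(z * exp(-j*theta))."""
    return (z * cmath.exp(-1j * theta)).real


def gradient_direction(coefficients, spectra, m):
    """Uniform part of the gradient at frequency m: sum_i T_i * conj(S_im)."""
    return sum(t * s[m].conjugate() for t, s in zip(coefficients, spectra))


def max_projection_value(realizable, gradient):
    """Noise-free rule: realizable value with the largest projection on the gradient."""
    theta = cmath.phase(gradient)
    return max(realizable, key=lambda h: project(theta, h))


def closest_value(realizable, ideal):
    """Noisy rule: realizable value nearest the ideal value (Euclidean distance)."""
    return min(realizable, key=lambda h: abs(h - ideal))


# Hypothetical phase-only SLM with 8 evenly spaced unit-magnitude values.
slm_values = [cmath.rect(1.0, 2 * cmath.pi * k / 8) for k in range(8)]
T = [1.0 + 0.0j, -0.5j]                                    # trial coefficients (assumed)
spectra = [[1.0 + 1.0j, 2.0j], [0.5 + 0.0j, 1.0 - 1.0j]]   # S_im for two images (assumed)

g0 = gradient_direction(T, spectra, 0)
h0 = max_projection_value(slm_values, g0)
h_noisy = closest_value(slm_values, 0.1 + 0.9j)  # ideal value supplied by the caller
```

In a search over the coefficients, these per-frequency selections would be repeated for every m at each trial set of $T_i$, and J evaluated for the resulting filter.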

Interestingly, there is a strong similarity with the initial formulation of the synthetic 
discriminant function [4] (SDF) for noiseless input. In that approach a linear sum of 
training images was sought that would cause exactly the desired central correlation 
intensities. There were two flaws in that approach: the computed filter was not realizable 
(and the tools to handle the mapping onto a filter encoding domain were not at hand), and 
the method needlessly specified certain complex correlation values that were not founded 
in the observation of intensity. 


4. Physical results 

We have obtained confirming optical bench results, but there is not room to show 
them here. A subsequent paper will explore some practical issues including search 
strategies, convergence in the search, limits on how many training images can be put 
into a filter, and selecting among operating curves for various noise environments. 